Cross-Validation and Minimum Generation Error based Decision Tree Pruning for HMM-based Speech Synthesis
Abstract
This paper presents a decision tree pruning method for model clustering in HMM-based parametric speech synthesis, using cross-validation (CV) under the minimum generation error (MGE) criterion. Decision-tree-based model clustering is an important component in the training process of an HMM-based speech synthesis system. Conventionally, the maximum likelihood (ML) criterion is employed to choose the optimal contextual question from the question set for each tree node split, and the minimum description length (MDL) principle is introduced as the stopping criterion to prevent building overly large tree models. Nevertheless, the MDL criterion is derived from an asymptotic assumption and is theoretically problematic when the training data set is not large enough. Moreover, the MDL criterion is inconsistent with the aim of speech synthesis. Therefore, a minimum cross generation error (MCGE) based decision tree pruning method for HMM-based speech synthesis is proposed in this paper. The initial decision tree is trained by MDL clustering with a factor estimated using the MCGE criterion by cross-validation. The decision tree size is then tuned by iteratively backing off or splitting each leaf node to minimize a cross generation error, defined as the sum of generation errors calculated for all training sentences using cross-validation. Objective and subjective evaluation results show that the proposed method significantly outperforms the conventional MDL-based model clustering method.
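The first step described in the abstract, estimating the MDL factor by cross-validation, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the modulo-based fold assignment, and the toy `train` and `generation_error` stand-ins are all assumptions made for illustration, and the subsequent leaf back-off and splitting step is not shown.

```python
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")  # one training sentence (hypothetical type)
M = TypeVar("M")  # one trained model (hypothetical type)


def cross_generation_error(
    sentences: Sequence[S],
    train: Callable[[List[S], float], M],
    generation_error: Callable[[M, S], float],
    factor: float,
    k: int = 5,
) -> float:
    """Sum held-out generation errors over all sentences using k-fold CV."""
    total = 0.0
    for fold in range(k):
        # Assumption: folds are assigned by sentence index modulo k.
        held_out = [s for i, s in enumerate(sentences) if i % k == fold]
        training = [s for i, s in enumerate(sentences) if i % k != fold]
        model = train(training, factor)
        total += sum(generation_error(model, s) for s in held_out)
    return total


def select_factor(
    sentences: Sequence[S],
    train: Callable[[List[S], float], M],
    generation_error: Callable[[M, S], float],
    candidates: Sequence[float],
    k: int = 5,
) -> float:
    """Return the candidate factor with the smallest cross generation error."""
    return min(
        candidates,
        key=lambda f: cross_generation_error(
            sentences, train, generation_error, f, k
        ),
    )


# Toy stand-ins (NOT HMM training): a "sentence" is one number, the "model"
# is the training mean shrunk by the factor, and the error is squared distance.
def toy_train(data: List[float], factor: float) -> float:
    return sum(data) / len(data) / (1.0 + factor)


def toy_error(model: float, sentence: float) -> float:
    return (model - sentence) ** 2


toy_sentences = [2.0] * 10
best_factor = select_factor(toy_sentences, toy_train, toy_error, [0.0, 1.0, 4.0])
```

In a real system, `train` would stand for MDL-based tree clustering with the given factor and `generation_error` for the distance between generated and natural speech parameters of a held-out sentence; both are left abstract here because the abstract does not specify them.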
Similar resources
Minimum generation error criterion for tree-based clustering of context dependent HMMs
Due to the inconsistency between HMM training and synthesis application in HMM-based speech synthesis, the minimum generation error (MGE) criterion has been proposed for HMM training. This paper continues to apply the MGE criterion to tree-based clustering of context-dependent HMMs. As directly applying the MGE criterion results in an unacceptable computational cost, the parameter updating rul...
Autoregressive clustering for HMM speech synthesis
The autoregressive HMM has been shown to provide efficient parameter estimation and high-quality synthesis, but in previous experiments decision trees derived from a non-autoregressive system were used. In this paper we investigate the use of autoregressive clustering for autoregressive HMM-based speech synthesis. We describe decision tree clustering for the autoregressive HMM and highlight dif...
Minimum generation error training with direct log spectral distortion on LSPs for HMM-based speech synthesis
A minimum generation error (MGE) criterion has been proposed to solve the issues related to maximum likelihood (ML) based HMM training in HMM-based speech synthesis. In this paper, we improve the MGE criterion by imposing a log spectral distortion (LSD) instead of the Euclidean distance to define the generation error between the original and generated line spectral pair (LSP) coefficients. More...
Overview of NIT HMM-based speech synthesis system for Blizzard Challenge 2010
This paper describes a hidden Markov model (HMM)-based speech synthesis system developed for the Blizzard Challenge 2010. This system employs STRAIGHT vocoding, minimum generation error (MGE) training, minimum generation error linear regression (MGELR) based model adaptation, the Bayesian speech synthesis framework, and the parameter generation algorithm considering global variance. The real-ti...
Generating intonation from a mixed CART-HMM model for speech synthesis
This paper proposes two algorithms for generating intonation from a mixed CART-HMM intonation model for speech synthesis. Based either on a Viterbi search or on the Expectation-Maximization algorithm, the two generation algorithms are analyzed in terms of likelihood and F0 Root Mean Square Error. Listening tests are performed to subjectively evaluate the quality of the generated intonation.
Journal title:
- IJCLCLP
Volume 15
Publication date: 2010